Geographically weighted regression analysis of anemia and its associated factors among reproductive age women in Ethiopia using 2016 demographic and health survey

Introduction Anemia in reproductive-age women is defined as a hemoglobin level <11 g/dl for lactating or pregnant mothers and <12 g/dl for non-pregnant or non-lactating women. Anemia is a global public health problem affecting both developing and developed countries. This study therefore aimed to examine anemia and its associated factors among reproductive-age women in Ethiopia using geographically weighted regression analysis of the 2016 Demographic and Health Survey. Method A total of 14,570 women of reproductive age were included in this study. Multilevel binary logistic regression models were fitted using STATA version 14. Odds ratios with 95% confidence intervals and p-values less than 0.05 were used to identify significant factors. Spatial scan statistics were used to identify anemia clusters using Kulldorff's SaTScan version 9.6 software. ArcGIS 10.7 software was used to visualize the spatial distribution and the geographically weighted regression of anemia among reproductive-age women. Result Overall, 23.8% of reproductive-age women were anemic. The SaTScan spatial analysis identified the primary clusters' spatial window in Southeastern Oromia and the entire Somali region. The GWR analysis showed that having a formal education and using pills, injectables, or implants decreased the risk of anemia, whereas having more than one child within five years increased the risk of anemia in Ethiopia. In addition, in the multilevel analysis, women who were married and women who had more than five family members were more likely to have anemia. Conclusion In Ethiopia, anemia among reproductive-age women was relatively high and varied spatially across the regions. Policymakers should give attention to mothers who have short birth intervals, married women, and women in large families. Women's education and family planning use, especially pills, implants, or injectables, should be strengthened.


Introduction
... anemia focusing on pregnant women by supplying iron (Fe) and folic acid, proper nutrition, education, deworming, promoting sanitation, and preventing and treating anemia. However, in the last 15 years, the trend of anemia has remained inconsistent [11]. Even though these interventions have been implemented, the prevalence of anemia among reproductive-age women in Ethiopia is still high [6,7].

Study design and setting
The study used population-based cross-sectional survey data from the 2016 Ethiopian Demographic and Health Survey (EDHS). Ethiopia (3°-14° N and 33°-48° E) is located in the Horn of Africa. The country covers 1.1 million square kilometers and has great geographic diversity, ranging from 4,550 m above sea level to 110 m below sea level in the Afar depression. There are nine regional states (Amhara, Afar, Southern Nations, Nationalities, and Peoples (SNNPR), Gambela, Benishangul-Gumuz, Harari, Oromia, Somali, and Tigray) and two city administrations (Addis Ababa and Dire Dawa). These areas are divided into 68 zones, 817 districts, and 16,253 kebeles (the lowest local administrative units of the country) [12].

Source and study population
The source population was all women aged 15 to 49 years in Ethiopia within the five years before the survey, while all reproductive-age women in the selected enumeration areas were the study population. The EDHS uses a two-stage stratified cluster sampling method, with the 2007 Population and Housing Census as the sampling frame. In the first stage, 645 enumeration areas (EAs) were selected with probability proportional to their size, and an independent sample was drawn in each sampling stratum. In the second stage, an average of 28 households per EA were systematically selected. Hemoglobin was measured in 14,489 women, of whom 14,171 were usual (de jure) residents of the surveyed households and were included in the study. As shown in Fig 1, the final analysis therefore used a total weighted sample of 14,570 women. Data collection took place from 18 January 2016 to 27 June 2016.

Outcome variable
The current study is based on the altitude-adjusted hemoglobin levels already reported in the 2016 EDHS data. Anemia is defined as a hemoglobin level <11 g/dl for lactating or pregnant mothers and <12 g/dl for non-pregnant or non-lactating women [1]. [13]. For community mass media exposure we used 13.8%, and for community women's education level we used 7.7% [6]. The normality of the aggregated community-level factors was assessed with histograms and the Shapiro-Wilk test; because they did not satisfy the normality assumption, we recoded them based on the median value.
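The outcome definition above can be sketched as a small classification rule. This is only an illustration of the stated cut-offs, not the authors' Stata code; the function name and arguments are hypothetical, and the hemoglobin value is assumed to be already altitude-adjusted and expressed in g/dl.

```python
def is_anemic(hb_g_dl: float, pregnant: bool, lactating: bool) -> bool:
    """Classify anemia from an altitude-adjusted hemoglobin value (g/dl).

    Cut-offs follow the definition stated in the text:
    <11 g/dl for pregnant or lactating women, <12 g/dl otherwise.
    """
    cutoff = 11.0 if (pregnant or lactating) else 12.0
    return hb_g_dl < cutoff


# Illustrative values only (not survey data)
print(is_anemic(10.5, pregnant=True, lactating=False))
print(is_anemic(11.5, pregnant=False, lactating=False))
```

The strict inequality means that a value exactly at the cut-off is classified as not anemic.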

Data processing and analysis
We accessed the data sets from the website www.measuredhs.com after submitting a reasonable request to the Demographic and Health Survey (DHS) program. The geographic coordinate data (latitude and longitude) of the selected enumeration areas were also obtained from the web page of the international DHS program. Data cleaning and management were carried out using Stata version 14 statistical software. Descriptive analyses were used to describe the prevalence of anemia among women of reproductive age (WRA). Before the spatial analysis, the weighted proportions (using the sample weight) of anemia among WRA and of the candidate explanatory variables were exported to ArcGIS.
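The weighted proportion per enumeration area that is exported for mapping can be sketched as follows. This is a minimal illustration, not the authors' procedure: the record layout is hypothetical, and it assumes the usual DHS convention that the stored sample weight has to be divided by 1,000,000 before use.

```python
from collections import defaultdict


def weighted_anemia_by_cluster(records):
    """Return {cluster_id: weighted proportion of anemic women}.

    records: iterable of (cluster_id, raw_weight, anemic) tuples, where
    raw_weight is the stored DHS sample weight (assumed scaled by 1,000,000)
    and anemic is a bool. The tuple layout is hypothetical.
    """
    weighted_cases = defaultdict(float)
    weighted_total = defaultdict(float)
    for cluster_id, raw_weight, anemic in records:
        weight = raw_weight / 1_000_000  # rescale the stored weight
        weighted_total[cluster_id] += weight
        if anemic:
            weighted_cases[cluster_id] += weight
    return {c: weighted_cases[c] / weighted_total[c] for c in weighted_total}


# Illustrative records only (not survey data)
example = [(1, 2_000_000, True), (1, 1_000_000, False), (2, 500_000, False)]
print(weighted_anemia_by_cluster(example))
```

Each cluster-level proportion can then be joined to the cluster coordinates before it is loaded into a GIS.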

Model building
Because the 2016 EDHS data are hierarchical, with individuals nested within communities, the assumptions of independence of observations and equality of variance are violated. Therefore, multilevel binary logistic regression was fitted to study the determinants of anemia among reproductive-age women. Four models were used in the multilevel analysis. The first (null) model contained only the outcome variable and was used to check the variability of anemia among WRA between communities. The second model contained only individual-level variables and the third model contained only community-level variables, whereas in the fourth model both the individual- and community-level variables were adjusted simultaneously. Models were compared using the log-likelihood ratio test, and the fourth model, which had the highest log-likelihood ratio, was selected as the best-fitting model.

Parameter estimation method
Both random-effect and fixed-effect parameters were included in the model. The random effects estimate the variation in the prevalence of anemia among reproductive-age women between clusters. We used the cluster number variable (v001) for the random-effect estimates. We estimated the intraclass correlation coefficient (ICC), the median odds ratio (MOR), and the proportional change in variance (PCV). The ICC gives the proportion of the variation in anemia among reproductive-age women that is due to differences between clusters. It is calculated as $\mathrm{ICC} = \frac{V_A}{V_A + 3.29} \times 100\%$, where $V_A$ = area/cluster-level variance [14][15][16]. The MOR can be understood as the increased risk (in median) that a woman would have if she moved to another area with a higher risk [16].
The PCV reveals the variation in anemia among reproductive age women which is explained by all factors. The PCV is calculated as $\mathrm{PCV} = \frac{V_{null} - V_A}{V_{null}} \times 100\%$, where $V_{null}$ = variance of the first model, and $V_A$ = variance of the model with more terms [14,16].
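The three random-effect measures can be computed from the cluster-level variances as in the sketch below. The ICC and PCV follow the formulas given in the text; the MOR formula is not stated in the text, so the commonly used expression exp(sqrt(2 V_A) × 0.6745) is assumed here, and the function names and example variances are hypothetical.

```python
import math

RESIDUAL_VARIANCE = 3.29  # level-1 variance of the standard logistic distribution, as used in the text


def icc_percent(v_a: float) -> float:
    """Intraclass correlation coefficient in percent for cluster variance v_a."""
    return v_a / (v_a + RESIDUAL_VARIANCE) * 100.0


def median_odds_ratio(v_a: float) -> float:
    """Median odds ratio (assumed standard formula; 0.6745 is the 75th
    percentile of the standard normal distribution)."""
    return math.exp(math.sqrt(2.0 * v_a) * 0.6745)


def pcv_percent(v_null: float, v_a: float) -> float:
    """Proportional change in variance relative to the null model, in percent."""
    return (v_null - v_a) / v_null * 100.0


# Illustrative variances only (not the estimates reported in this study)
v_null, v_full = 0.75, 0.45
print(icc_percent(v_null), median_odds_ratio(v_null), pcv_percent(v_null, v_full))
```

A cluster variance of zero gives an ICC of 0% and an MOR of 1, which corresponds to no between-cluster variation.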
The fixed effects assess the relationship between the likelihood of anemia among women of reproductive age and the predictors. Factors with a p-value ≤ 0.2 in the crude odds ratio (COR) analysis were selected for the final model. Associations between the outcome and explanatory variables were assessed, and their strength was presented using adjusted odds ratios with 95% confidence intervals, with a p-value of <0.05 as the cut-off for significance.

Spatial analysis
For spatial analysis, ArcGIS 10.7 and SaTScan version 9.6 software were used. A statistical measure of spatial autocorrelation (Global Moran's I) was used to assess the spatial distribution of anemia among WRA in Ethiopia [17]. Hot spot analysis (Getis-Ord Gi* statistic) was used to characterize clusters spatially as hot or cold spots, and the ordinary kriging spatial interpolation technique was used to predict the proportion of anemia among WRA in unsampled areas of the country based on the sampled EAs. Bernoulli-based spatial scan statistics were employed to determine the geographical locations of statistically significant clusters of anemia among WRA. To fit the Bernoulli model, women who had anemia were taken as cases and women who had no anemia were taken as controls, and a scanning window was moved across the study area. The default maximum spatial cluster size of <50% of the population was used as an upper limit, allowing both small and large clusters to be detected. The primary, secondary, and other significant clusters were identified and ranked based on the log-likelihood ratio (LLR) test using 999 Monte Carlo replications. The circle with the highest LLR statistic is defined as the most likely (primary) cluster, that is, the cluster least likely to have occurred by chance.
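The statistic used to rank candidate windows can be illustrated with the log-likelihood ratio of the Bernoulli scan model for a single window. This is a minimal sketch of that one quantity under the standard Bernoulli formulation, not a reimplementation of SaTScan; it omits the window search and the Monte Carlo replications, and the function names and example counts are hypothetical.

```python
import math


def _xlogy(x: float, y: float) -> float:
    """Return x * ln(y), using the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c: int, n: int, C: int, N: int) -> float:
    """LLR of one scanning window under the Bernoulli model.

    c, n: anemic women (cases) and all women inside the window.
    C, N: anemic women and all women in the whole study area.
    Only windows with a higher rate inside than outside are scored (high-rate scan).
    """
    if not 0 < n < N:
        raise ValueError("window must contain some, but not all, observations")
    rate_in = c / n
    rate_out = (C - c) / (N - n)
    if rate_in <= rate_out:
        return 0.0
    log_l_window = (
        _xlogy(c, rate_in)
        + _xlogy(n - c, 1.0 - rate_in)
        + _xlogy(C - c, rate_out)
        + _xlogy((N - n) - (C - c), 1.0 - rate_out)
    )
    log_l_null = _xlogy(C, C / N) + _xlogy(N - C, 1.0 - C / N)
    return log_l_window - log_l_null


# Illustrative counts only (not survey data)
print(bernoulli_llr(c=50, n=100, C=100, N=1000))
```

In a full scan, this value would be computed for every candidate window, and the maximum would be compared with the maxima obtained from random relabellings of cases and controls.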

Ordinary least square analysis
The ordinary least squares analysis was done using the variables that were found to be significant in the final multilevel model. The ordinary least squares (OLS) regression model is a global model that estimates a single coefficient per independent variable over the entire study area. Model performance and significance were then checked using the VIF, R-square, Koenker, and Jarque-Bera statistics, the expected signs of the coefficients, and the spatial autocorrelation of the residuals.
The model structure of the ordinary least squares equation [18] is written as $y_i = \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + \varepsilon_i$, where $i = 1, 2, \ldots, n$; $\beta_0, \beta_1, \beta_2, \ldots, \beta_p$ are the model parameters, $y_i$ is the outcome variable for observation $i$, $x_{ik}$ are the explanatory variables, and $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are the error terms/residuals with zero mean and homogeneous variance $\sigma^2$.
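The global model can be sketched with a least-squares fit that returns one coefficient per explanatory variable together with the adjusted R-square. This is only an illustration using NumPy, not the ArcGIS OLS tool used in the study; the function name and the synthetic data are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y_i = b0 + sum_k b_k x_ik + e_i; return (coefficients, adjusted R^2).

    X: (n, p) array of explanatory variables; y: (n,) outcome.
    The first returned coefficient is the intercept b0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return beta, adj_r2


# Synthetic cluster-level proportions (not survey data)
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(200, 3))
y_demo = 0.3 - 0.2 * X_demo[:, 0] - 0.1 * X_demo[:, 1] + 0.4 * X_demo[:, 2]
y_demo = y_demo + rng.normal(scale=0.05, size=200)
beta_demo, adj_r2_demo = fit_ols(X_demo, y_demo)
print(beta_demo, adj_r2_demo)
```

Because a single coefficient vector applies to every location, spatially autocorrelated residuals from such a fit suggest that the relationships may vary over space.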

Geographically weighted regression analysis
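Unlike OLS, which fits a single linear regression equation to all of the data in the study area, GWR fits a separate equation for each location, so that each coefficient can vary over space. The model structure of the geographically weighted regression equation [19] is written as $y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i$, where $y_i$ is the outcome variable at location $(u_i, v_i)$, $\beta_0(u_i, v_i)$ is the local intercept, $\beta_k(u_i, v_i)$, $k = 1, \ldots, p$, are $p$ unknown functions of the geographic location $(u_i, v_i)$, $x_{ik}$ are the explanatory variables at location $(u_i, v_i)$, $i = 1, 2, \ldots, n$, and $\varepsilon_i$ are the error terms/residuals with zero mean and homogeneous variance $\sigma^2$. The OLS and GWR models were compared using different parameters. Finally, the coefficients estimated by GWR were mapped.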
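The local fitting idea can be sketched as a weighted least-squares fit repeated at every location. The sketch below assumes a fixed Gaussian distance-decay kernel and Euclidean distances on projected coordinates; the kernel and bandwidth actually used in ArcGIS for this study are not stated in the text, and the function name and synthetic data are hypothetical.

```python
import numpy as np


def fit_gwr(coords, X, y, bandwidth):
    """Return an (n, p + 1) array of local coefficients, one row per location.

    coords: (n, 2) projected coordinates (u_i, v_i); X: (n, p); y: (n,).
    Weights are w_ij = exp(-0.5 * (d_ij / bandwidth)^2) (assumed kernel).
    """
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # local intercept plus predictors
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
        xtw = design.T * weights  # X^T W without forming the n x n matrix W
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas


# Synthetic example in which the effect of x changes from west to east
rng = np.random.default_rng(0)
coords_demo = rng.uniform(0, 10, size=(150, 2))
x_demo = rng.uniform(size=(150, 1))
slope_demo = -0.5 + 0.1 * coords_demo[:, 0]
y_demo = 0.2 + slope_demo * x_demo[:, 0]
local_betas = fit_gwr(coords_demo, x_demo, y_demo, bandwidth=2.0)
print(local_betas[:5])
```

Mapping one column of the returned array shows where the corresponding explanatory variable is a stronger or weaker predictor.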

Ethical considerations
Permission to access the data was obtained from ICF International by registering and stating the purpose of the study. The data set contains no household addresses or individual names. The data were used for the registered research topic only and were not shared with third parties. All the data were fully anonymized before we accessed them and/or ICF International waived the requirement for informed consent. No medical records were used in the research, since the data came from a demographic and health survey.

Sociodemographic characteristics
Among the total weighted sample of 14,570 reproductive-age women, the mean ± standard deviation (SD) age of the respondents was 28 ± 9 years. A total of 4,572 (31.38%) women were lactating and 1,069 (7.34%) were pregnant. More than a quarter (25.6%) of the respondents were current contraceptive users (Table 1).

The trend of anemia among reproductive age women in Ethiopia
The prevalence of anemia among reproductive-age women decreased from 27% in 2005 to 17% in 2011, but it increased to 23.

Multi-level analysis of factors associated with anemia among reproductive age women
Random effect and model comparison. The ICC value in the null model (Table 2) showed that 18% of the variation in anemia among reproductive-age women was attributable to cluster-level factors. The MOR value in the null model showed that the odds of anemia among reproductive-age women differed by a factor of 2.28 between higher- and lower-prevalence clusters. Moreover, the PCV value of the final model showed that the community- and individual-level factors together explained about 40.2% of the variation in anemia among WRA. The deviance and likelihood ratio tests were used to compare the models, and Model 4, which had the lowest deviance and the highest likelihood ratio value, was the best-fitting model (Table 2).
Fixed effect outputs. In the multilevel analysis output of the final model (Model 4), the women's marital status, the women's educational status, the household wealth index, family size, hormonal contraceptive use, and region of residence were significantly associated with anemia among reproductive-age women.
The odds of developing anemia among reproductive age women who were married are 1.

Spatial analysis results of anemia among reproductive-age women in Ethiopia (EDHS 2016)
Spatial distribution, incremental and spatial autocorrelation analysis. The spatial distribution of anemia among reproductive-age women in Ethiopia shows significant spatial variation across the country. The Afar, Somali, and Dire Dawa regions had a high prevalence of anemia among WRA, whereas the Benishangul-Gumuz, Amhara, and SNNPR regions had a low prevalence (Fig 4). Anemia among reproductive-age women was spatially clustered in Ethiopia, with a Global Moran's I value of 0.38 (p < 0.001). The Z-score of 23.35 indicated that there is less than a 1% likelihood that this clustered pattern could be the result of random chance (Fig 5). The incremental spatial autocorrelation analysis showed statistically significant peak z-scores, which mark the distances at which the spatial processes promoting clustering are most pronounced, at 151.4 km (z-score 20.82) and 195.8 km (z-score 21.25) (Fig 6).
Hot and cold spot analysis. Fig 7 shows the more intense clustering of a high proportion of anemia among reproductive-age women (hot spots), represented by red dots, in the Somali, Dire Dawa, and Afar regions of Ethiopia. In contrast, the Amhara, SNNPR, and Tigray regions were lower-risk areas (cold spots), represented by blue dots (Fig 7).
Spatial SaTScan analysis. Most likely (primary) and secondary significant clusters of anemia among WRA were identified. A total of 198 significant clusters were found, of which 50 were most likely (primary) clusters and 43 were secondary clusters. The primary clusters' spatial window was located in the Somali and Southeastern Oromia regions, centered at 6.023458 N, 44.807507 E, with a radius of 462.80 km and a log-likelihood ratio (LLR) of 206.7 (p < 0.001).
Women within the spatial window of the primary clusters had a 2.33 times higher risk of anemia than women outside the window, whereas for the secondary cluster the risk was 2.37 times higher (Table 4 and Fig 8).
Kriging interpolation. Ordinary kriging interpolation was used to produce a continuous surface of predicted anemia among reproductive-age women. The predicted level of anemia increases from green to red: red indicates predicted high-risk areas, and green indicates predicted low-risk areas of anemia among reproductive-age women. On this basis, the Somali and Afar regions and the southern parts of the Oromia region were predicted to be at higher risk than other regions (Fig 9).

Factors affecting the spatial variation of anemia among reproductive-age women (modeling spatial relationships)
Ordinary least square regression (OLS). As shown in Table 5, the OLS model explained about 35.5% (adjusted R square = 0.355) of the spatial variation in anemia among reproductive-age women. The coefficients represent the strength and type of the relationship between each explanatory variable and anemia among WRA. Since the Koenker (BP) statistic was significant, we used the robust probabilities to determine the statistical significance of the coefficients. The coefficients for women who have formal education, women who use pills/injectables/implants, and women who have more than one child within five years were statistically significant (p < 0.01). The Joint Wald statistic was statistically significant (p < 0.01), which shows that the overall model was significant, and there was no multicollinearity between the explanatory variables (variance inflation factor (VIF) < 7.5). In addition, the spatial autocorrelation test (Moran's I = 0.21, p < 0.01) revealed that the residuals were spatially autocorrelated (Table 5).
Geographically weighted regression. GWR improves on the global OLS model when the relationships between the predictors and anemia among WRA are non-stationary.
As shown in Tables 5 and 6, the GWR model had a higher adjusted R square and a lower corrected Akaike's Information Criterion (AICc) value than the OLS model, which supports moving from the global model (OLS) to the local regression model (GWR); that is, fitting the GWR improved the model. Figs 11-13 show the geographical areas where the explanatory variables (attending formal education, using pills/injectable/implant contraceptives, and having more than one child within five years) were strong and weak predictors of anemia among WRA in Ethiopia.
Having a formal education had a negative relationship with anemia among WRA. The red clustered points (found in western parts of Amhara, SNNPR, and the entire Gambela region) indicate areas where the coefficients were largest, that is, where the negative relationship between attending formal education and anemia among WRA was strongest (Fig 11).
As shown in Fig 12, the use of pills, injectable contraceptives, and implants had a strong negative relationship with anemia among WRA in northern Tigray, the northern and eastern SNNPR region, and Addis Ababa.

PLOS ONE
Having more than one child within five years had a positive relationship with anemia among WRA in eastern Amhara, western Afar, and the Somali region (Fig 13).

Discussion
Because of their high demand for iron during pregnancy and lactation, monthly bleeding, and nutritional deficiencies, anemia is a serious public health problem among reproductive-age women [2,10]. This study investigated the prevalence of anemia and its associated factors among women of reproductive age in Ethiopia, based on the 2016 EDHS, using geographically weighted regression analysis.
When this result is compared with previous EDHS reports, it is lower than that of the 2005 EDHS (27%) but higher than that of the 2011 EDHS (17%) [27]. This might be due to differences in the intervention approaches taken by the Ethiopian government and in their performance. Moreover, the number of reproductive-age women included in each EDHS might have had an effect [10,27].
Reproductive-age women who had a family size greater than five had higher odds of anemia than those with a family size of less than two. This finding is in line with different studies [6,9]. This could be because a large family size leads to food insecurity in the home, jeopardizing women's access to a healthy diet.
In this study, reproductive-age women who were married were more likely to have anemia than women of all other marital statuses. This is in line with studies conducted in Ethiopia [28] and East Africa [22], but differs from the 2005 EDHS report [29] and a study in Rwanda [30]. This might be because most married women become pregnant and lactate, and the complications that result may make them anemic.
In this study, reproductive-age women who used pills, injectables, or implant contraceptives were less likely to be affected by anemia. This is in line with studies in East Africa, Rwanda, and Nepal [22,25,30]. This may be because these contraceptive methods are highly effective in preventing pregnancy and, with it, the complications related to pregnancy and childbirth.
In addition, hormonal contraceptive methods can themselves reduce menstrual bleeding, and the non-contraceptive, iron-containing pills are also used to prevent heavy menstrual bleeding and to regulate menses [6].
The odds of anemia among reproductive-age women who gave birth to more than one child within five years were higher than among women who did not give birth within that period. This finding is in line with studies in Ethiopia [9] and India [8]. A possible explanation is that a short birth interval delays the restoration of iron and other micronutrient stores in the body between pregnancies; in addition, women with frequent births could have a history of obstetric complications such as postpartum hemorrhage and sepsis, which directly expose them to anemia [1,31].
In this study, reproductive-age women who were rich were less likely to have anemia than poor and poorest women. This is in line with different studies [15,22,30,32,33]. This could be because a woman whose wealth status has improved is able to purchase nutritious food and can use health services [33].
Reproductive-age women who had attained primary or higher education were less likely to be affected by anemia. This is in line with studies in Ethiopia [6] and Rwanda [30]. This might be because educated women have better health-seeking behavior and service utilization than uneducated women, so they are able to obtain preventive and curative services for conditions that contribute to anemia.
The findings of this study showed that anemia among reproductive-age women had significant spatial variation in Ethiopia. The SaTScan spatial statistics detected a total of 198 significant clusters with a high prevalence of anemia among reproductive-age women. The Somali, Dire Dawa, and Afar regions of Ethiopia had significant hotspot areas of anemia among WRA, whereas the Amhara, SNNPR, and Tigray regions were lower-risk areas. Studies conducted in Ethiopia [7,34,35] and other developing countries [36,37] have also pointed out significant regional variations in anemia among WRA. Moreover, the multilevel results revealed that the odds of anemia among reproductive-age women living in the Somali and Afar regions were higher than among those living in the Tigray region. This is in line with studies conducted in Ethiopia [7,34] and Tanzania [36]. This might result from differences in dietary preferences and disease burden, inequalities in access to health care across the regions, and differences in societal beliefs and cultural practices regarding the care of women. In addition, recurrent drought may have triggered food insecurity, which might have contributed to the higher prevalence of anemia in these regions [6].
The GWR analysis revealed that having a formal education and using pills, injectables, or implants were negatively related to anemia among WRA. However, women who had more than one child within five years were more likely to have anemia in multiple regions of Ethiopia. The findings from the GWR analysis were similar to those of the multilevel analysis conducted in this study.
The use of nationally representative data with a large sample size was the key strength of this study. Another strength was the use of multilevel analysis to account for the correlated nature of the data. The use of spatial analysis, including modeling spatial relationships with GWR to identify factors that contributed to the spatial variation of anemia among WRA, was a further strength. This research, however, has certain limitations. Because we used secondary data, we were not able to include important factors such as hookworm infestation and diet type.

Conclusion
In Ethiopia, anemia among reproductive-age women varied spatially across the regions. The GWR analysis showed that having a formal education and using pills, injectables, or implants decreased the risk of anemia among reproductive-age women, whereas having more than one child within five years increased the risk. Therefore, the Ethiopian Federal Ministry of Health (FMoH) should pay special attention to groups of women with a higher prevalence of anemia, such as women with more births within five years, married women, and women in large families. Women's education and family planning use, especially pills, implants, or injectables, should be strengthened. Anemia prevention and control programs should be strengthened for WRA living in areas with a high prevalence of anemia, such as the Afar, Somali, and Dire Dawa regions.